Genome-wide synteny through highly sensitive sequence alignment: Satsuma

Authors

  • Manfred G. Grabherr
  • Pamela Russell
  • Miriah D. Meyer
  • Evan Mauceli
  • Jessica Alföldi
  • Federica Di Palma
  • Kerstin Lindblad-Toh

Abstract

MOTIVATION: Comparative genomics heavily relies on alignments of large and often complex DNA sequences. From an engineering perspective, the problem here is to provide maximum sensitivity (to find all there is to find), specificity (to only find real homology) and speed (to accommodate the billions of base pairs of vertebrate genomes).

RESULTS: Satsuma addresses all three issues through novel strategies: (i) cross-correlation, implemented via fast Fourier transform; (ii) a match scoring scheme that eliminates almost all false hits; and (iii) an asynchronous 'battleship'-like search that allows for aligning two entire fish genomes (470 and 217 Mb) in 120 CPU hours using 15 processors on a single machine.

AVAILABILITY: Satsuma is part of the Spines software package, implemented in C++ on Linux. The latest version of Spines can be freely downloaded under the LGPL license from http://www.broadinstitute.org/science/programs/genome-biology/spines/.
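
To illustrate the idea of cross-correlation via fast Fourier transform named in the abstract, here is a minimal sketch and not Satsuma's actual implementation (which is in C++ and adds its own windowing, scoring and search). It assumes each sequence is encoded as four 0/1 indicator signals, one per base, so that the summed cross-correlation at a given shift equals the number of matching bases at that shift. The function names `fft`, `match_counts` and `best_shift` are hypothetical and chosen only for this example.

```python
import cmath


def fft(values, inverse=False):
    """Radix-2 Cooley-Tukey FFT (unnormalised); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def match_counts(query, target):
    """Map each shift s to the number of positions i with query[i] == target[i + s]."""
    # Zero-pad to a power of two that is long enough to avoid circular wrap-around.
    size = 1
    while size < len(query) + len(target):
        size *= 2
    totals = [0.0] * size
    for base in "ACGT":
        q = [1.0 if c == base else 0.0 for c in query]
        t = [1.0 if c == base else 0.0 for c in target]
        fq = fft(q + [0.0] * (size - len(q)))
        ft = fft(t + [0.0] * (size - len(t)))
        # Correlation theorem: corr = IFFT(conj(FFT(q)) * FFT(t)).
        corr = fft([a.conjugate() * b for a, b in zip(fq, ft)], inverse=True)
        for s in range(size):
            totals[s] += corr[s].real / size  # divide by size to normalise the inverse
    counts = {}
    # Negative shifts are stored at the end of the circular result (index size + s).
    for s in range(-(len(query) - 1), len(target)):
        counts[s] = int(round(totals[s % size]))
    return counts


def best_shift(query, target):
    """Return (shift, matches) for the shift with the most matching bases."""
    counts = match_counts(query, target)
    shift = max(counts, key=counts.get)
    return shift, counts[shift]
```

For example, `best_shift("GATTACA", "CCCCCGATTACACCCC")` returns `(5, 7)`: the query lines up with the target at offset 5 with all seven bases matching. The point of the FFT is that all shifts are scored at once in O(n log n) time rather than one shift at a time.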


Similar resources

Draft Sequencing of the Heterozygous Diploid Genome of Satsuma (Citrus unshiu Marc.) Using a Hybrid Assembly Approach

Satsuma (Citrus unshiu Marc.) is one of the most abundantly produced mandarin varieties of citrus, known for its seedless fruit production and as a breeding parent of citrus. De novo assembly of the heterozygous diploid genome of Satsuma ("Miyagawa Wase") was conducted by a hybrid assembly approach using short-read sequences, three mate-pair libraries, and a long-read sequence of PacBio by the ...


The UniMarker (UM) method for synteny mapping of large genomes

MOTIVATION: Synteny mapping, or detecting regions that are orthologous between two genomes, is a key step in studies of comparative genomics. For completely sequenced genomes, this is increasingly accomplished by whole-genome sequence alignment. However, such methods are computationally expensive, especially for large genomes, and require rather complicated post-processing procedures to filter o...


Homologous synteny Block Detection Based on Suffix Tree Algorithms

A synteny block represents a set of contiguous genes located within the same chromosome and well conserved among various species. Through long evolutionary processes and genome rearrangement events, large numbers of synteny blocks remain highly conserved across multiple species. Understanding distribution of conserved gene blocks facilitates evolutionary biologists to trace the diversity of lif...


Whole Genome Alignments and Synteny Maps

INTRODUCTION: It was not until the genomes of closely related organisms had been sequenced that people started to think about aligning genomes and chromosomes instead of short DNA sequences. A large number of algorithms and methods have been developed for sequence alignment in order to find conserved sequences across species. The Smith & Waterman and Needleman & Wunsch algorithms are among the most popular...


Genome evolution-aware gene trees

A gene family tree is traditionally inferred from a multiple alignment of homologous sequences according to a model of sequence evolution. Trees for several gene families are thus constructed independently from each other. They often carry unresolutions or bad resolutions. Information for their full resolution may lie in the poorly exploited dependency between gene families, each bringing info...



Journal:
  • Bioinformatics

Volume 26, Issue 9

Pages: -

Publication date: 2010